Another look at linear programming for feature selection via methods of regularization
نویسندگان
چکیده
We consider statistical procedures for feature selection defined by a family of regularization problems with convex piecewise linear loss functions and penalties of l1 nature. Many known statistical procedures (e.g. quantile regression and support vector machines with l1 norm penalty) are subsumed under this category. Computationally, the regularization problems are linear programming (LP) problems indexed by a single parameter, which are known as ‘parametric cost LP’ or ‘parametric right-hand-side LP’ in the optimization theory. Exploiting the connection with the LP theory, we lay out general algorithms, namely, the simplex algorithm and its variant for generating regularized solution paths for the feature selection problems. The significance of such algorithms is that they allow a complete exploration of the model space along the paths and provide a broad view of persistent features in the data. The implications of the general path-finding algorithms are outlined for a few statistical procedures, and they are illustrated with numerical examples.
منابع مشابه
A stochastic model for project selection and scheduling problem
Resource limitation in zero time may cause to some profitable projects not to be selected in project selection problem, thus simultaneous project portfolio selection and scheduling problem has received significant attention. In this study, budget, investment costs and earnings are considered to be stochastic. The objectives are maximizing net present values of selected projects and minimizing v...
متن کاملFast Feature Selection from Microarray Expression Data via Multiplicative Large Margin Algorithms
New feature selection algorithms for linear threshold functions are described which combine backward elimination with an adaptive regularization method. This makes them particularly suitable to the classification of microarray expression data, where the goal is to obtain accurate rules depending on few genes only. Our algorithms are fast and easy to implement, since they center on an incrementa...
متن کاملSOLVING FUZZY LINEAR PROGRAMMING PROBLEMS WITH LINEAR MEMBERSHIP FUNCTIONS-REVISITED
Recently, Gasimov and Yenilmez proposed an approach for solving two kinds of fuzzy linear programming (FLP) problems. Through the approach, each FLP problem is first defuzzified into an equivalent crisp problem which is non-linear and even non-convex. Then, the crisp problem is solved by the use of the modified subgradient method. In this paper we will have another look at the earlier defuzzifi...
متن کاملTowards Feature Selection in Networks
Traditional feature selection methods assume that the data are independent and identically distributed (i.i.d.). In real world, tremendous amounts of data are distributed in a network. Existing features selection methods are not suited for networked data because the i.i.d. assumption no longer holds. This motivates us to study feature selection in a network. In this paper, we present a supervis...
متن کاملAutomatic Smoothing and Variable Selection via Regularization
This thesis focuses on developing computational methods and the general theory of automatic smoothing and variable selection via regularization. Methods of regularization are a commonly used technique to get stable solution to ill-posed problems such as nonparametric regression and classification. In recent years, methods of regularization have also been successfully introduced to address a cla...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Statistics and Computing
دوره 24 شماره
صفحات -
تاریخ انتشار 2014